
    One-Class Support Measure Machines for Group Anomaly Detection

    We propose one-class support measure machines (OCSMMs) for group anomaly detection, which aims to recognize anomalous aggregate behaviors of data points. OCSMMs generalize the well-known one-class support vector machines (OCSVMs) to a space of probability measures. By formulating the problem as quantile estimation on distributions, we establish an interesting connection to OCSVMs and to variable kernel density estimators (VKDEs) over the input space on which the distributions are defined, bridging the gap between large-margin methods and kernel density estimators. In particular, we show that various types of VKDEs can be considered as solutions to a class of regularization problems studied in this paper. Experiments on the Sloan Digital Sky Survey dataset and a High Energy Particle Physics dataset demonstrate the benefits of the proposed framework in real-world applications.
    Comment: Conference on Uncertainty in Artificial Intelligence (UAI 2013)
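    A minimal sketch of the core construction, not the authors' implementation: each group is embedded as a kernel mean, and a standard one-class SVM is trained on the resulting inter-group kernel. The synthetic groups, Gaussian bandwidth, and nu parameter below are illustrative assumptions.

```python
# Sketch: group anomaly detection via a one-class SVM on kernel mean embeddings.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, size=(50, 2)) for _ in range(20)]   # "normal" groups
groups += [rng.normal(0, 3, size=(50, 2)) for _ in range(2)]   # anomalous spread

def mean_embedding_kernel(A, B, gamma=0.5):
    """Inner product of the kernel mean embeddings of point sets A and B:
    (1/|A||B|) * sum_ij exp(-gamma * ||a_i - b_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2).mean()

n = len(groups)
K = np.array([[mean_embedding_kernel(groups[i], groups[j]) for j in range(n)]
              for i in range(n)])

ocsmm = OneClassSVM(kernel="precomputed", nu=0.1).fit(K)
print(ocsmm.predict(K))   # -1 flags candidate anomalous groups
```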

    Comment on "Support Vector Machines with Applications"

    Comment on "Support Vector Machines with Applications" [math.ST/0612817]
    Comment: Published at http://dx.doi.org/10.1214/088342306000000484 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Deep Nonlinear Non-Gaussian Filtering for Dynamical Systems

    Filtering is a general name for inferring the states of a dynamical system from observations. The most common approach is Gaussian filtering (GF), in which the distribution of the inferred states is a Gaussian whose mean is an affine function of the observations. This model carries two restrictions: Gaussianity and affinity. We propose a model that relaxes both assumptions, based on recent advances in implicit generative models. Empirical results show that the proposed method gives a significant advantage over GF and over nonlinear methods based on fixed nonlinear kernels.
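    For contrast with the baseline, a minimal scalar Kalman filter, the canonical Gaussian filter: the posterior mean is an affine function of the observation, which is exactly the restriction being relaxed. The model coefficients below are assumptions for illustration, not from the paper.

```python
# Sketch: scalar Kalman filter (Gaussian filtering with an affine mean).
import numpy as np

a, c = 0.9, 1.0    # assumed state-transition and observation coefficients
q, r = 0.1, 0.5    # assumed process and observation noise variances
m, p = 0.0, 1.0    # prior mean and variance of the state

rng = np.random.default_rng(1)
x = 0.0
for _ in range(10):
    x = a * x + rng.normal(0.0, np.sqrt(q))       # latent state evolves
    y = c * x + rng.normal(0.0, np.sqrt(r))       # noisy observation
    m, p = a * m, a * a * p + q                   # predict step
    k = p * c / (c * c * p + r)                   # Kalman gain
    m, p = m + k * (y - c * m), (1 - k * c) * p   # update: m is affine in y
    print(f"y = {y:+.2f}   filtered mean = {m:+.2f}")
```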

    The representer theorem for Hilbert spaces: a necessary and sufficient condition

    A family of regularization functionals is said to admit a linear representer theorem if every member of the family admits minimizers that lie in a fixed finite-dimensional subspace. A recent characterization states that a general class of regularization functionals with differentiable regularizers admits a linear representer theorem if and only if the regularization term is a non-decreasing function of the norm. In this report, we improve on that result by replacing the differentiability assumption with lower semi-continuity and by deriving a proof that is independent of the dimensionality of the space.
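    For orientation, the standard statement being characterized (usual notation assumed: data x_1, ..., x_n, RKHS H with kernel k, regularizer h):

```latex
% If the regularization functional
%   J(f) = C\big((x_1, f(x_1)), \ldots, (x_n, f(x_n))\big) + h\big(\lVert f \rVert_{\mathcal{H}}\big)
% has a non-decreasing regularizer h, then it admits a minimizer of the form
f^{\star}(\cdot) \;=\; \sum_{i=1}^{n} c_i \, k(x_i, \cdot), \qquad c_i \in \mathbb{R},
% i.e. a minimizer in the finite-dimensional span of the kernel sections k(x_i, \cdot).
```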

    Submodular Inference of Diffusion Networks from Multiple Trees

    Diffusion and propagation of information, influence, and diseases take place over increasingly large networks. We observe when a node copies information, makes a decision, or becomes infected, but the underlying networks are often hidden or unobserved. Since networks are highly dynamic, changing and growing rapidly, we only observe a relatively small set of cascades before a network changes significantly. Scalable network inference from a small cascade set is therefore necessary for understanding the rapidly evolving dynamics that govern diffusion. In this article, we develop a scalable approximation algorithm based on submodular maximization, with provably near-optimal performance and high accuracy in this setting, solving an open problem first posed by Gomez-Rodriguez et al. (2010). Experiments on synthetic and real diffusion data show that in practice our algorithm achieves an optimal trade-off between accuracy and running time.
    Comment: To appear in the 29th International Conference on Machine Learning (ICML), 2012. Website: http://www.stanford.edu/~manuelgr/network-inference-multitree
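    A generic sketch of the greedy template behind such submodular-maximization guarantees, not the paper's likelihood-based objective: at each step, add the candidate edge with the largest marginal gain; for monotone submodular objectives this yields the classical (1 - 1/e) approximation. The cascades, candidate edges, stand-in coverage objective, and budget below are assumptions.

```python
# Sketch: greedy maximization of a monotone submodular objective over edges.
from itertools import combinations

cascades = [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "d"}]   # assumed observations
candidate_edges = list(combinations("abcd", 2))

def score(edges):
    """Stand-in coverage objective (monotone submodular): a cascade counts as
    explained if some chosen edge has both endpoints infected in it."""
    return sum(any(u in c and v in c for (u, v) in edges) for c in cascades)

selected, budget = set(), 3
for _ in range(budget):
    best = max((e for e in candidate_edges if e not in selected),
               key=lambda e: score(selected | {e}) - score(selected))
    selected.add(best)
print(sorted(selected))
```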

    Causal Inference on Discrete Data using Additive Noise Models

    Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. Recently, methods using additive noise models have been suggested to approach the case of continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work we extend the notion of additive noise models to these cases. We prove that whenever the joint distribution $P^{(X,Y)}$ admits such a model in one direction, e.g. $Y = f(X) + N$ with $N \perp X$, it does not admit the reversed model $X = g(Y) + \tilde{N}$ with $\tilde{N} \perp Y$, as long as the model is chosen in a generic way. Based on these deliberations we propose an efficient new algorithm that is able to distinguish between cause and effect for a finite sample of discrete variables. In an extensive experimental study we show that this algorithm works on both synthetic and real data sets.
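    A rough sketch of the resulting decision rule, with synthetic data and a chi-squared test standing in for the independence test: fit f as the conditional mode, form residuals, and keep the direction in which the residuals look independent of the putative cause. All data and parameters below are illustrative assumptions.

```python
# Sketch: additive-noise-model test for a discrete cause-effect pair.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(2)
X = rng.integers(0, 4, size=2000)
N = rng.choice([-1, 0, 1], size=2000, p=[0.2, 0.6, 0.2])   # noise independent of X
Y = (2 * X + N) % 7                                        # Y = f(X) + N (mod 7)

def indep_pvalue(cause, effect, m=7):
    """p-value for 'residual independent of cause' after fitting f as the mode."""
    values = np.unique(cause)
    f = {v: np.bincount(effect[cause == v], minlength=m).argmax() for v in values}
    resid = np.array([(e - f[c]) % m for c, e in zip(cause, effect)])
    table = np.array([np.bincount(resid[cause == v], minlength=m) for v in values])
    table = table[:, table.sum(axis=0) > 0]   # drop empty residual columns
    return chi2_contingency(table)[1]

print("X -> Y:", indep_pvalue(X, Y))   # large p: forward model accepted
print("Y -> X:", indep_pvalue(Y, X))   # small p: reversed model rejected
```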

    Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions

    Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures μ from some set M to functions in a reproducing kernel Hilbert space (RKHS) with kernel k. The RKHS distance between two mapped measures is a semi-metric d_k over M. We study three questions. (I) For a given kernel, which sets M can be embedded? (II) When is the embedding injective over M (in which case d_k is a metric)? (III) How does the d_k-induced topology compare to other topologies on M? The existing machine learning literature has addressed these questions in cases where M is (a subset of) the finite regular Borel measures. We unify, improve, and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic, and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, d_k metrises the weak convergence of probability measures if and only if k is continuous and characteristic.
    Comment: Older and longer version of the JMLR paper with the same title (published 2018). Please start with the JMLR version. 55 pages (33 pages main text, 22 pages appendix), 2 tables, 1 figure (in appendix).
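    A small sketch of d_k in the empirical case, where it reduces to the (biased) maximum mean discrepancy between two samples; the Gaussian kernel, its bandwidth, and the data are assumptions for illustration.

```python
# Sketch: the RKHS distance d_k between two empirical measures (biased MMD)
# with a Gaussian kernel, which is characteristic on R^d, so d_k is a metric.
import numpy as np

def gaussian_gram(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def d_k(A, B, gamma=0.5):
    """Empirical d_k(P_A, P_B): RKHS norm of the difference of mean embeddings."""
    return np.sqrt(gaussian_gram(A, A, gamma).mean()
                   + gaussian_gram(B, B, gamma).mean()
                   - 2.0 * gaussian_gram(A, B, gamma).mean())

rng = np.random.default_rng(3)
P1 = rng.normal(0.0, 1.0, size=(500, 2))
P2 = rng.normal(0.0, 1.0, size=(500, 2))   # same distribution as P1
Q = rng.normal(0.5, 1.0, size=(500, 2))    # shifted distribution
print("d_k(P1, P2) ~", d_k(P1, P2))        # near zero
print("d_k(P1, Q)  ~", d_k(P1, Q))         # clearly positive
```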